Methods and compositions for correlating genetic markers with risk of aggressive prostate cancer

ABSTRACT

The present invention provides a method of identifying a subject as having an increased risk of having or developing aggressive prostate cancer, comprising detecting in the subject the presence of various polymorphisms associated with an increased risk of having or developing aggressive prostate cancer.

STATEMENT OF PRIORITY

This application is a continuation application of, and claims priority to, U.S. application Ser. No. 13/344,907, filed Jan. 6, 2012, which claims the benefit, under 35 U.S.C. §119(e), of U.S. Provisional Application Ser. No. 61/430,352, filed Jan. 6, 2011, the entire contents of each of which are incorporated by reference herein.

STATEMENT OF GOVERNMENT SUPPORT

This invention was made with government support under Grant Nos. CA129684, CA106523, CA105055, CA95052, CA1125117, CA133009 and CA131338 awarded by the National Cancer Institute and Grant Nos. PC051264 and W81XWH-09-1-0488 awarded by the Department of Defense. The government has certain rights in the invention.

FIELD OF THE INVENTION

The present invention provides methods and compositions directed to identification of genetic markers associated with prostate cancer.

BACKGROUND OF THE INVENTION

Prostate cancer accounts for one-fourth of all cancer diagnoses in men in the United States, with an estimated 192,280 new cases in 2009 (1). Although most men will have an indolent form of the disease, aggressive prostate cancers are currently the second leading cause of cancer deaths in men in the United States. Most cases of prostate cancer are diagnosed as a result of having an elevated serum level of prostate-specific antigen (PSA). PSA-based disease screening leading to early detection and treatment of prostate cancer (PCa) has contributed to the reduction in mortality observed for this disease in the United States over the past several years (1). However, results from two large randomized trials in Europe and the US provide strong evidence that PSA-based screening for PCa is associated with a high risk of overdiagnosis (2,3). In the European trial, PSA screening was associated with decreased PCa-related mortality but at a great cost: ~1,410 men needed to be screened, and 48 additional PCa cases would need to be treated to prevent one death from PCa (2). Although interpretation of these findings is still a subject of discussion, the current inability to accurately distinguish risk for life-threatening, aggressive PCa from the overwhelming majority of indolent cases contributes to the dilemma.

Recent breakthroughs in genome-wide association studies (GWAS) have led to the discovery of more than two dozen reported single nucleotide polymorphisms (SNPs) that are associated with PCa risk by comparing men with and without PCa using case-control study designs (6-25). Unfortunately, none of these PCa risk associated SNPs consistently distinguishes risk for more or less aggressive cancer (26-28), nor are they associated with prostate cancer-specific mortality (29). As a result, there has been much debate regarding the clinical utility of these SNPs as a risk stratification tool (30,31). Clearly, an alternative approach is needed to identify genetic markers that distinguish those men who are at risk for developing more aggressive PCa.

The present invention overcomes previous shortcomings in the art by identifying significant statistical associations between genetic markers and prostate cancer. Thus, the present invention provides methods and compositions for identifying a subject at increased risk of developing aggressive prostate cancer by detecting the genetic markers of this invention in the subject.

SUMMARY OF THE INVENTION

The present invention provides a method of identifying a human subject as having an increased risk of developing aggressive prostate cancer, comprising detecting in a nucleic acid sample from the subject a T allele at single nucleotide polymorphism rs4054823 in chromosome region 17p12, wherein the detection of said allele identifies the subject as having an increased risk of developing aggressive prostate cancer.

Also provided herein is a method of identifying a human subject as having an increased risk of developing aggressive prostate cancer, comprising detecting in a nucleic acid sample from the subject an allele that is in linkage disequilibrium with the T allele at single nucleotide polymorphism rs4054823 in chromosome region 17p12, wherein the detection of said allele identifies the subject as having an increased risk of developing aggressive prostate cancer.

Furthermore, the present invention provides a kit containing oligonucleotides and other reagents for detecting an allele or combination of alleles of this invention.

Additionally provided herein is a computer-assisted method of identifying a proposed treatment for aggressive prostate cancer as an effective and/or appropriate treatment for a subject carrying a genetic marker correlated with aggressive prostate cancer, comprising the steps of: (a) storing a database of biological data for a plurality of subjects, the biological data that is being stored including for each of said plurality of subjects: (i) a treatment type, (ii) at least one genetic marker associated with aggressive prostate cancer, and (iii) at least one disease progression measure for prostate cancer from which treatment efficacy can be determined; and then (b) querying the database to determine the dependence on said genetic marker of the effectiveness of a treatment type in treating prostate cancer, thereby identifying a proposed treatment as an effective and/or appropriate treatment for a subject carrying a genetic marker correlated with prostate cancer.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1. Flow chart of the study design. Numbers of subjects with more or less aggressive prostate cancer in each study population are indicated in parentheses.

FIG. 2. Frequency of TT genotype of rs4054823 at 17p12 among PCa patients from the (A) JHH population and (B) CAPS population of Sweden.

DETAILED DESCRIPTION OF THE INVENTION

The present invention is explained in greater detail below. This description is not intended to be a detailed catalog of all the different ways in which the invention may be implemented, or all the features that may be added to the instant invention. For example, features illustrated with respect to one embodiment may be incorporated into other embodiments, and features illustrated with respect to a particular embodiment may be deleted from that embodiment. In addition, numerous variations and additions to the various embodiments suggested herein will be apparent to those skilled in the art in light of the instant disclosure, which do not depart from the instant invention. Hence, the following specification is intended to illustrate some particular embodiments of the invention, and not to exhaustively specify all permutations, combinations and variations thereof.

The present invention is based on the unexpected discovery of particular alleles of single nucleotide polymorphisms (SNPs) that are statistically associated with an increased risk of developing aggressive prostate cancer. There are numerous benefits of carrying out the methods of this invention to identify a subject having an increased risk of developing aggressive prostate cancer, including but not limited to, identifying subjects who are good candidates for prophylactic and/or therapeutic treatment, and screening for cancer at an earlier time or more frequently than might otherwise be indicated, to increase the chances of early detection of an aggressive prostate cancer.

Thus, in one aspect, the present invention provides a method of identifying a subject (e.g., a human subject) as having an increased risk of developing aggressive prostate cancer, comprising detecting in a nucleic acid sample from the subject a T allele at single nucleotide polymorphism rs4054823 in chromosome region 17p12, wherein the detection of said allele identifies the subject as having an increased risk of developing aggressive prostate cancer.

The present invention further provides a method of identifying a subject as having an increased risk of developing aggressive prostate cancer, comprising detecting in a nucleic acid sample from the subject an allele in linkage disequilibrium (LD) with the T allele at single nucleotide polymorphism rs4054823 in chromosome region 17p12. Alleles in LD with the T allele at single nucleotide polymorphism rs4054823 in chromosome region 17p12 are provided herein in Table 1. Such alleles can be detected individually (e.g., detection of a risk allele at a single SNP location) as well as in any combination (e.g., detection of a risk allele at more than one (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, etc.) SNP location). In some embodiments, when analyzed in combination, the combination can comprise detection of the T allele at rs4054823 in addition to detection of one or more of the alleles of Table 1. In some embodiments, the combination can be a (e.g., any) combination of the alleles of Table 1 without the T allele of rs4054823.

In some embodiments of this invention, the subject can be homozygous for the T allele at single nucleotide polymorphism rs4054823 in chromosome region 17p12. In other embodiments, the subject can be heterozygous for the T allele at single nucleotide polymorphism rs4054823 in chromosome region 17p12. The presence of the T allele, either homozygously or heterozygously, at single nucleotide polymorphism rs4054823 in chromosome region 17p12 identifies the subject as having an increased risk of developing aggressive prostate cancer. In the methods provided herein wherein a combination of alleles is analyzed, the subject can be heterozygous or homozygous for any given allele in any combination relative to the other alleles in the combination.

In certain embodiments of this invention, the methods described herein can be employed to identify 1) a subject at increased or decreased risk of a more aggressive form of prostate cancer (e.g., having a Gleason score of 7 (4+3) to 10); 2) a subject at increased or decreased risk of a poor prognosis (e.g., increased likelihood the cancer will metastasize, will be poorly responsive to treatment and/or will lead to death) once cancer has been diagnosed in the subject; and/or 3) a subject at increased or decreased risk of an early age of onset of prostate cancer (e.g., aggressive prostate cancer), by identifying in the subject the alleles of this invention.

It is further contemplated that the methods of this invention can be carried out to diagnose aggressive prostate cancer in a subject, by detecting the T allele of SNP rs4054823 and/or detecting any combination of the alleles of this invention in nucleic acid from the subject.

In further aspects, the present invention provides a kit for carrying out the methods of this invention, wherein the kit can comprise oligonucleotides (e.g., primers, probes, primer/probe sets, etc.), reagents, buffers, etc., as would be known in the art, for the detection of the alleles of this invention in a nucleic acid sample. For example, a primer or probe can comprise a contiguous nucleotide sequence that is complementary (e.g., fully (100%) complementary or partially (50%, 60%, 70%, 80%, 90%, 95%, etc.) complementary) to a region comprising an allele of this invention. In particular embodiments, a kit of this invention will comprise primers and probes that allow for the specific detection of the alleles of this invention. Such a kit can further comprise blocking probes, labeling reagents, blocking agents, restriction enzymes, antibodies, sampling devices, positive and negative controls, etc., as would be well known to those of ordinary skill in the art. Thus, in some embodiments, the present invention provides a kit comprising oligonucleotides to detect the T allele of single nucleotide polymorphism rs4054823 in chromosome region 17p12 in a nucleic acid sample. In further embodiments, the present invention provides a kit comprising oligonucleotides to detect an allele or combination of alleles in linkage disequilibrium with the T allele of single nucleotide polymorphism rs4054823 in chromosome region 17p12 in a nucleic acid sample, such as the alleles set forth in Table 1 herein. Such oligonucleotides can be identified and prepared and employed in methods according to the teachings and protocols described herein and as are well known in the art.

DEFINITIONS

As used herein, “a,” “an” or “the” can mean one or more than one. For example, “a” cell can mean a single cell or a multiplicity of cells.

Also as used herein, “and/or” refers to and encompasses any and all possible combinations of one or more of the associated listed items, as well as the lack of combinations when interpreted in the alternative (“or”).

Furthermore, the term “about,” as used herein when referring to a measurable value such as an amount of a compound or agent of this invention, dose, time, temperature, and the like, is meant to encompass variations of ±20%, ±10%, ±5%, ±1%, ±0.5%, or even ±0.1% of the specified amount.

As used herein, the term “prostate cancer” or “PCa” describes an uncontrolled (malignant) growth of cells in the prostate gland, which is located at the base of the urinary bladder and is responsible for helping control urination as well as forming part of the semen. Symptoms of prostate cancer can include, but are not limited to, urinary problems (e.g., not being able to urinate; having a hard time starting or stopping the urine flow; needing to urinate often, especially at night; weak flow of urine; urine flow that starts and stops; pain or burning during urination), difficulty having an erection, blood in the urine and/or semen, and/or frequent pain in the lower back, hips, and/or upper thighs.

As used herein, the term “aggressive prostate cancer” means prostate cancer that is poorly differentiated, having a Gleason grade of 7 or above. An “indolent prostate cancer” means prostate cancer having a Gleason grade below 7 (e.g., 6 or less). The Gleason grading system is the most commonly used method for grading PCa and is well known in the art.
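The threshold in this definition can be written as a small helper. The following is a minimal sketch only: the function name is hypothetical, and it assumes the "Gleason grade" referred to above is the combined Gleason score, which ranges from 2 to 10.

```python
def classify_by_gleason(score):
    """Label a tumor per the definition above (hypothetical helper).

    Assumes `score` is the combined Gleason score (2 to 10).
    7 or above -> "aggressive"; below 7 -> "indolent".
    """
    if not 2 <= score <= 10:
        raise ValueError("Gleason score must be between 2 and 10")
    return "aggressive" if score >= 7 else "indolent"


if __name__ == "__main__":
    for s in (6, 7, 9):
        print(s, classify_by_gleason(s))
```

Note that this sketch applies only the "7 or above" cut-off stated in the definition; other passages may draw a finer distinction within score 7.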

All the SNP positions described herein are based on Build 36.

The term “chromosome region” as used herein refers to a part of a chromosome defined either by anatomical details, especially by banding, or by its linkage groups. The particular chromosome region of this invention is 17p12.

Also as used herein, “linked” describes a region of a chromosome that is shared more frequently in family members or members of a population manifesting a particular phenotype and/or affected by a particular disease or disorder, than would be expected or observed by chance, thereby indicating that the gene or genes or other identified marker(s) within the linked chromosome region contain or are associated with an allele that is correlated with the phenotype and/or presence of a disease or disorder (e.g., aggressive PCa), or with an increased or decreased likelihood of the phenotype and/or of the disease or disorder. Once linkage is established, association studies (linkage disequilibrium) can be used to narrow the region of interest or to identify the marker (e.g., allele or haplotype) correlated with the phenotype and/or disease or disorder.

Furthermore, as used herein, the term “linkage disequilibrium” or “LD” refers to the occurrence in a population of two or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, etc.) linked alleles at a frequency higher or lower than expected on the basis of the gene frequencies of the individual genes. Thus, linkage disequilibrium describes a situation where alleles occur together more often than can be accounted for by chance, which indicates that the two or more alleles are physically close on a DNA strand.
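As an illustration of this definition, the departure of an observed haplotype frequency from the frequency expected under independence is commonly summarized by the coefficient D and the squared correlation r². The following is a minimal sketch, not the method used in the Examples: the function name, the input layout (one tuple per phased chromosome) and the sample haplotypes are all assumptions made for illustration.

```python
def linkage_disequilibrium(haplotypes, allele_a, allele_b):
    """Return (D, r_squared) for allele_a at locus 1 and allele_b at locus 2.

    `haplotypes` is a list of (locus1_allele, locus2_allele) tuples,
    one per phased chromosome (hypothetical input format).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    # Observed frequencies of each allele and of the two alleles together.
    p_a = sum(1 for h in haplotypes if h[0] == allele_a) / n
    p_b = sum(1 for h in haplotypes if h[1] == allele_b) / n
    p_ab = sum(1 for h in haplotypes if h == (allele_a, allele_b)) / n
    # D: observed joint frequency minus the frequency expected by chance.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, r_squared


if __name__ == "__main__":
    # Illustrative data only: the two alleles always travel together.
    sample = [("T", "A")] * 4 + [("C", "G")] * 4
    print(linkage_disequilibrium(sample, "T", "A"))
```

A value of D near zero means the alleles co-occur about as often as chance predicts; a positive D means they co-occur more often than expected.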

The term “genetic marker” or “polymorphism” as used herein refers to a characteristic of a nucleotide sequence (e.g., in a chromosome) that is identifiable due to its variability among different subjects (i.e., the genetic marker or polymorphism can be a single nucleotide polymorphism, an allele of a single nucleotide polymorphism, a restriction fragment length polymorphism, a microsatellite, a deletion of nucleotides, an addition of nucleotides, a substitution of nucleotides, a repeat or duplication of nucleotides, a translocation of nucleotides, and/or an aberrant or alternate splice site resulting in production of a truncated or extended form of a protein, etc., as would be well known to one of ordinary skill in the art).

A “single nucleotide polymorphism” (SNP) in a nucleotide sequence is a genetic marker that is polymorphic for two (or in some cases three or four) alleles. SNPs can be present within a coding sequence of a gene, within noncoding regions of a gene and/or in an intergenic (e.g., intron) region of a gene. A SNP in a coding region in which both forms lead to the same polypeptide sequence is termed synonymous (i.e., a silent mutation) and if a different polypeptide sequence is produced, the alleles of that SNP are non-synonymous. SNPs that are not in protein coding regions can still have effects on gene splicing, transcription factor binding and/or the sequence of non-coding RNA.

The SNP nomenclature provided herein refers to the official Reference SNP (rs) identification number as assigned to each unique SNP by the National Center for Biotechnology Information (NCBI), which is available in the GenBank® database.

In some embodiments, the term genetic marker is also intended to describe a phenotypic effect of an allele or haplotype, including for example, an increased or decreased amount of a messenger RNA, an increased or decreased amount of protein, an increase or decrease in the copy number of a gene, production of a defective protein, tissue or organ, etc., as would be well known to one of ordinary skill in the art.

An “allele” as used herein refers to one of two or more alternative forms of a nucleotide sequence at a given position (locus) on a chromosome (e.g., at a single nucleotide polymorphism). An allele can be a nucleotide present in a nucleotide sequence that makes up the coding sequence of a gene and/or an allele can be a nucleotide in a non-coding region of a gene (e.g., in a genomic sequence). A subject's genotype for a given gene is the set of alleles the subject happens to possess. As noted herein, an individual can be heterozygous or homozygous for any allele of this invention.

Also as used herein, a “haplotype” is a set of alleles on a single chromatid that are statistically associated. It is thought that these associations, and the identification of a few alleles of a haplotype block, can unambiguously identify all other alleles in its region. The term “haplotype” is also commonly used to describe the genetic constitution of individuals with respect to one member of a pair of allelic genes; sets of single alleles or closely linked genes that tend to be inherited together.

The terms “increased risk” and “decreased risk” as used herein define the level of risk that a subject has of developing aggressive prostate cancer, as compared to a control subject that does not have the alleles of this invention in the control subject's nucleic acid.

A sample of this invention can be any sample containing nucleic acid from a subject, as would be well known to one of ordinary skill in the art. Nonlimiting examples of a sample of this invention include a cell, a body fluid, a tissue, biopsy material, a washing, a swabbing, etc., as would be well known in the art.

A subject of this invention is any animal that is susceptible to prostate cancer as defined herein and can include, for example, humans, as well as animal models of prostate cancer (e.g., rats, mice, dogs, nonhuman primates, etc.). In some aspects of this invention, the subject can be Caucasian (e.g., white; European-American; Hispanic), as well as of black African ancestry (e.g., black; African, Sub-Saharan African, African American; African-European; African-Caribbean, etc.) or Asian. In further aspects of this invention, the subject can have a family history of prostate cancer or aggressive prostate cancer (e.g., having at least one first degree relative having or diagnosed with prostate cancer or aggressive prostate cancer) and in some embodiments, the subject does not have a family history of prostate cancer or aggressive prostate cancer. Additionally a subject of this invention can have a diagnosis of prostate cancer or aggressive prostate cancer in certain embodiments and in other embodiments, a subject of this invention does not have a diagnosis of prostate cancer or aggressive prostate cancer. In yet further embodiments, the subject of this invention can have an elevated prostate-specific antigen (PSA) level and in other embodiments, the subject of this invention can have a normal or non-elevated PSA level. In some embodiments, the PSA level of the subject may not be known and/or has not been measured.

As used herein, “nucleic acid” encompasses both RNA and DNA, including cDNA, genomic DNA, mRNA, synthetic (e.g., chemically synthesized) DNA and chimeras, fusions and/or hybrids of RNA and DNA. The nucleic acid can be double-stranded or single-stranded. Where single-stranded, the nucleic acid can be a sense strand or an antisense strand. In some embodiments, the nucleic acid can be synthesized using oligonucleotide analogs or derivatives (e.g., inosine or phosphorothioate nucleotides, etc.). Such oligonucleotides can be used, for example, to prepare nucleic acids that have altered base-pairing abilities or increased resistance to nucleases.

An “isolated nucleic acid” is a nucleotide sequence that is not immediately contiguous with nucleotide sequences with which it is immediately contiguous (one on the 5′ end and one on the 3′ end) in the naturally occurring genome of the organism from which it is derived or in which it is detected or identified. Thus, in one embodiment, an isolated nucleic acid includes some or all of the 5′ non-coding (e.g., promoter) sequences that are immediately contiguous to a coding sequence. The term therefore includes, for example, a recombinant DNA that is incorporated into a vector, into an autonomously replicating plasmid or virus, or into the genomic DNA of a prokaryote or eukaryote, or which exists as a separate molecule (e.g., a cDNA or a genomic DNA fragment produced by PCR or restriction endonuclease treatment), independent of other sequences. It also includes a recombinant DNA that is part of a hybrid nucleic acid encoding an additional polypeptide or peptide sequence.

The term “isolated” can refer to a nucleic acid or polypeptide that is substantially free of cellular material, viral material, and/or culture medium (e.g., when produced by recombinant DNA techniques), or chemical precursors or other chemicals (when chemically synthesized). Moreover, an “isolated fragment” is a fragment of a nucleic acid or polypeptide that is not naturally occurring as a fragment and would not be found in the natural state.

The term “oligonucleotide” refers to a nucleic acid sequence of at least about five nucleotides to about 500 nucleotides (e.g. 5, 6, 7, 8, 9, 10, 12, 15, 18, 20, 21, 22, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 100, 125, 150, 175, 200, 250, 300, 350, 400, 450 or 500 nucleotides). In some embodiments, for example, an oligonucleotide can be from about 15 nucleotides to about 30 nucleotides, or about 20 nucleotides to about 25 nucleotides, which can be used, for example, as a primer in a polymerase chain reaction (PCR) amplification assay and/or as a probe in a hybridization assay or in a microarray. Oligonucleotides of this invention can be natural or synthetic, e.g., DNA, RNA, PNA, LNA, modified backbones, etc., as are well known in the art.

The present invention further provides fragments of the nucleic acids of this invention, which can be used, for example, as oligonucleotides, primers and/or probes. Such fragments or oligonucleotides can be detectably labeled or modified, for example, to include and/or incorporate a restriction enzyme cleavage site when employed as a primer in an amplification (e.g., PCR) assay.

The detection of a polymorphism, genetic marker or allele of this invention can be carried out according to various protocols standard in the art and as described herein for analyzing nucleic acid samples and nucleotide sequences, as well as identifying specific nucleotides in a nucleotide sequence.

For example, nucleic acid can be obtained from any suitable sample from the subject that will contain nucleic acid and the nucleic acid can then be prepared and analyzed according to well-established protocols for the presence of genetic markers according to the methods of this invention. In some embodiments, analysis of the nucleic acid can be carried out by amplification of the region of interest according to amplification protocols well known in the art (e.g., polymerase chain reaction, ligase chain reaction, strand displacement amplification, transcription-based amplification, self-sustained sequence replication (3SR), Qβ replicase protocols, nucleic acid sequence-based amplification (NASBA), repair chain reaction (RCR) and boomerang DNA amplification (BDA), etc.). The amplification product can then be visualized directly in a gel by staining or the product can be detected by hybridization with a detectable probe. When amplification conditions allow for amplification of all allelic types of a genetic marker, the types can be distinguished by a variety of well-known methods, such as hybridization with an allele-specific probe, secondary amplification with allele-specific primers, by restriction endonuclease digestion, and/or by electrophoresis. Thus, the present invention further provides oligonucleotides for use as primers and/or probes for detecting and/or identifying genetic markers according to the methods of this invention.

In some embodiments of this invention, detection of an allele or combination of alleles of this invention can be carried out by an amplification reaction and single base extension. In particular embodiments, the product of the amplification reaction and single base extension is spotted on a silicon chip.

In yet additional embodiments, detection of an allele or combination of alleles of this invention can be carried out by matrix-assisted laser desorption/ionization-time of flight mass spectrometry (MALDI-TOF-MS).

It is further contemplated that the detection of an allele or combination of alleles of this invention can be carried out by various methods that are well known in the art, including, but not limited to, nucleic acid sequencing, hybridization assay, restriction endonuclease digestion analysis, electrophoresis, and any combination thereof.

The genetic markers (e.g., alleles) of this invention are correlated with (i.e., identified to be statistically associated with) aggressive prostate cancer as described herein according to methods well known in the art and as disclosed in the Examples provided herein for statistically correlating genetic markers with various phenotypic traits, including disease states and pathological conditions as well as determining levels of risk associated with developing a particular phenotype, such as a disease or pathological condition. In general, identifying such correlation involves conducting analyses that establish a statistically significant association and/or a statistically significant correlation between the presence of a genetic marker or a combination of markers and the phenotypic trait in a population of subjects and controls (e.g., a population of subjects in whom the phenotype is not present or has not been detected). The correlation can involve one or more than one genetic marker of this invention (e.g., two, three, four, five, or more) in any combination. An analysis that identifies a statistical association (e.g., a significant association) between the marker or combination of markers and the phenotype establishes a correlation between the presence of the marker or combination of markers in a population of subjects and the particular phenotype being analyzed. A level of risk (e.g., increased or decreased) can then be determined for an individual on the basis of such population-based analyses.
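One common way to carry out such an analysis is a two-by-two comparison of marker frequency between subjects with the phenotype and controls. The following is a minimal sketch under stated assumptions, not the analysis disclosed in the Examples: it uses a Pearson chi-square test with one degree of freedom and no continuity correction, the function name is hypothetical, and the counts in the usage lines are invented for illustration.

```python
import math


def association_2x2(case_with, case_without, control_with, control_without):
    """Odds ratio, Pearson chi-square (1 df) and P value for a 2x2 table.

    Rows are cases and controls; columns are subjects with and without
    the marker. All four counts are hypothetical inputs.
    """
    a, b, c, d = case_with, case_without, control_with, control_without
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("every row and column must contain at least one subject")
    # Odds of carrying the marker among cases relative to controls.
    odds_ratio = (a * d) / (b * c) if b * c > 0 else math.inf
    chi2 = n * (a * d - b * c) ** 2 / margins
    # Upper tail of a chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value


if __name__ == "__main__":
    # Invented counts: 30 of 100 cases and 20 of 100 controls carry the marker.
    print(association_2x2(30, 70, 20, 80))
```

An odds ratio above 1 with a small P value would be read as the marker being more frequent in the case group than chance alone explains; what threshold counts as significant is a separate design decision.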

Thus, in certain embodiments, the present invention provides a method of screening a subject for a genetic marker (e.g., an allele at a SNP site) that is associated with aggressive prostate cancer, comprising: a) performing a population based study to detect polymorphisms (e.g., alleles) in a group of subjects with aggressive prostate cancer and a group of control subjects; b) identifying polymorphisms in the aggressive prostate cancer group of subjects that are statistically associated with the presence of aggressive prostate cancer; and c) screening a subject for the presence of the polymorphisms identified in step (b).

The present invention further provides a method of identifying an effective and/or appropriate (i.e., for a given subject's particular condition or status) treatment regimen for a subject with aggressive prostate cancer, comprising detecting one or more of the polymorphisms and genetic markers associated with aggressive prostate cancer of this invention in the subject, wherein the one or more polymorphisms and genetic markers are further statistically correlated with an effective and/or appropriate treatment regimen for aggressive prostate cancer according to protocols as described herein and as are well known in the art.

Also provided is a method of identifying an effective and/or appropriate treatment regimen for a subject with aggressive prostate cancer, comprising: a) correlating the presence of one or more genetic markers of this invention in a test subject or population of test subjects with aggressive prostate cancer for whom an effective and/or appropriate treatment regimen has been identified; and b) detecting the one or more markers of step (a) in the subject, thereby identifying an effective and/or appropriate treatment regimen for the subject.

Further provided is a method of correlating a polymorphism or genetic marker of this invention with an effective and/or appropriate treatment regimen for aggressive prostate cancer, comprising: a) detecting in a subject or a population of subjects with aggressive prostate cancer and for whom an effective and/or appropriate treatment regimen has been identified, the presence of one or more genetic markers or polymorphisms of this invention; and b) correlating the presence of the one or more genetic markers of step (a) with an effective treatment regimen for aggressive prostate cancer.

Examples of treatment regimens for prostate cancer are well known in the art. Subjects who respond well to particular treatment protocols can be analyzed for specific genetic markers and a correlation can be established according to the methods provided herein. Alternatively, subjects who respond poorly to a particular treatment regimen can also be analyzed for particular genetic markers correlated with the poor response. Then, a subject who is a candidate for treatment for aggressive prostate cancer can be assessed for the presence of the appropriate genetic markers and the most effective and/or appropriate treatment regimen can be provided as early as possible.

In some embodiments, the methods of correlating genetic markers with treatment regimens of this invention can be carried out using a computer database. Thus the present invention provides a computer-assisted method of identifying a proposed treatment for aggressive prostate cancer as an effective and/or appropriate treatment for a subject carrying a genetic marker correlated with aggressive prostate cancer. The method involves the steps of (a) storing a database of biological data for a plurality of subjects, the biological data that is being stored including for each of said plurality of subjects, for example, (i) a treatment type, (ii) at least one genetic marker associated with aggressive prostate cancer and (iii) at least one disease progression measure for aggressive prostate cancer from which treatment efficacy can be determined; and then (b) querying the database to determine the correlation between the presence of said genetic marker and the effectiveness of a treatment type in treating aggressive prostate cancer, to thereby identify a proposed treatment as an effective treatment for aggressive prostate cancer and/or an appropriate treatment for a subject carrying a genetic marker correlated with aggressive prostate cancer. In such methods, the genetic marker associated with aggressive prostate cancer can be a T allele in single nucleotide polymorphism rs4054823 in chromosome region 17p12.

In some embodiments, treatment information for a subject is entered into the database (through any suitable means such as a window or text interface), genetic marker information for that subject is entered into the database, and disease progression information is entered into the database. These steps are then repeated until the desired number of subjects has been entered into the database. The database can then be queried to determine whether a particular treatment is effective for subjects carrying a particular marker or combination of markers, not effective for subjects carrying a particular marker or combination of markers, etc. Such querying can be carried out prospectively or retrospectively on the database by any suitable means, but is generally done by statistical analysis in accordance with known techniques, as described herein.
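By way of illustration only, the storing and querying steps described above can be sketched with an in-memory SQLite database. The table layout, the column names, the coding of marker status as TT versus non-TT, and the toy records are all assumptions made for this sketch; they are not data from the studies described herein, and any suitable database and statistical analysis could be substituted.

```python
import sqlite3

# Hypothetical schema and toy records, for illustration only (not study data).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE subjects (
        subject_id  INTEGER PRIMARY KEY,
        treatment   TEXT NOT NULL,     -- (i) treatment type
        genotype    TEXT NOT NULL,     -- (ii) genotype at the marker
        progressed  INTEGER NOT NULL   -- (iii) progression measure: 1 = yes, 0 = no
    )
    """
)
toy_rows = [
    (1, "surgery", "TT", 0),
    (2, "surgery", "TT", 0),
    (3, "surgery", "TT", 1),
    (4, "surgery", "CT", 0),
    (5, "surgery", "CC", 0),
    (6, "radiation", "TT", 1),
    (7, "radiation", "TT", 1),
    (8, "radiation", "CT", 0),
    (9, "radiation", "CC", 1),
]
conn.executemany("INSERT INTO subjects VALUES (?, ?, ?, ?)", toy_rows)


def progression_by_marker(connection, treatment):
    """Return {marker status: (number of subjects, fraction progressed)}
    for one treatment type, grouping subjects as TT versus non-TT."""
    query = """
        SELECT CASE WHEN genotype = 'TT' THEN 'TT' ELSE 'non-TT' END AS carrier,
               COUNT(*),
               AVG(progressed)
        FROM subjects
        WHERE treatment = ?
        GROUP BY carrier
    """
    return {
        carrier: (n, fraction)
        for carrier, n, fraction in connection.execute(query, (treatment,))
    }


radiation_summary = progression_by_marker(conn, "radiation")
surgery_summary = progression_by_marker(conn, "surgery")
```

In practice the grouped fractions would be compared by a formal statistical test rather than read off directly, and the progression measure could be any outcome from which treatment efficacy can be determined.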

The following examples are not intended to limit the scope of the claims to the invention, but are rather intended to be exemplary of certain embodiments. Any variations in the exemplified methods that occur to the skilled artisan are intended to fall within the scope of the present invention. As will be understood by one skilled in the art, there are several embodiments and elements for each aspect of the claimed invention, and all combinations of different elements are hereby anticipated, so the specific combinations exemplified herein are not to be construed as limitations in the scope of the invention as claimed. If specific elements are removed or added to the group of elements available in a combination, then the group of elements is to be construed as having incorporated such a change.

EXAMPLES

Abstract.

Autopsy studies suggest that most aging men will develop lesions that, if detected clinically, would be diagnosed as prostate cancer (PCa). Most of these cancers are indolent and remain localized; however, a subset of PCa is aggressive and accounts for more than 27,000 deaths in the United States annually. Identification of factors specifically associated with risk for more aggressive PCa is urgently needed to reduce overdiagnosis and overtreatment of this common disease. To search for such factors, the frequencies of SNPs were compared among PCa patients who were defined as having either more aggressive or less aggressive disease in four populations examined in the Cancer Genetic Markers of Susceptibility (CGEMS) study performed by the National Cancer Institute. SNPs showing possible associations with disease severity were further evaluated in an additional three independent study populations from the United States and Sweden. In total, 4,829 and 12,205 patients with more and less aggressive disease, respectively, were studied. It was found that the frequency of the TT genotype of SNP rs4054823 at 17p12 was consistently higher among patients with more aggressive compared with less aggressive disease in each of the seven populations studied, with an overall P value of 2.1×10⁻⁸ under a recessive model, exceeding the conservative genome-wide significance level. The difference in frequency was largest between patients with high-grade, non-organ-confined disease compared with those with low-grade, organ-confined disease. This study demonstrates that inherited variants predisposing to aggressive but not indolent PCa exist in the genome and demonstrates the clinical potential of such variants as early markers for risk of aggressive PCa.

Study Subjects.

Seven independent populations were included in this study (Table 2). The first four populations were from the publicly available CGEMS study, and include the Prostate, Lung, Colon and Ovarian (PLCO) Cancer Screening Trial, the American Cancer Society Cancer Prevention Study II (CPS-II), the Health Professionals Follow-up Study (HPFS), and the Alpha-Tocopherol, Beta-Carotene Cancer Prevention Study (ATBC) (9, 11). PCa aggressiveness was defined by the CGEMS study as follows: patients with clinical stage T3/T4 or Gleason score of 7 or higher (stage and grade designations as described herein) based on biopsy specimens were classified as having more aggressive disease, whereas the remaining patients were classified as having less aggressive disease.

The other three populations were from our collaborative research group, including a hospital-based case series from the Johns Hopkins Hospital (JHH) and two population-based studies based on the National Prostate Cancer Register of Sweden: a case-control study, CAncer Prostate in Sweden (CAPS) (41, 26), and a case series of PCa patients treated for localized PCa (PROCAP) (42, 43).

PCa patients from the CAPS study were identified and recruited from four regional cancer registries in Sweden, diagnosed between July 2001 and October 2003. Patients were classified as having more aggressive disease if their cancers met any of the following criteria: advanced stage as evidenced by disease spread outside of the prostate; presence of cancer in the lymph nodes or other metastatic sites (clinical stage T3/T4, N+, M+, respectively); presence of poorly differentiated cancer at biopsy as indicated by a high Gleason score (i.e., 4+4=8 or higher; Gleason scores are the sum of the two most prevalent histologic patterns, rated on a scale of 1-5, with 5 being the most poorly differentiated); or a serum PSA level associated with a high likelihood of extensive disease (>50 ng/mL) (n=1,231). Otherwise, the patients were classified as having less aggressive disease (n=1,619) (Table 4).
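The CAPS classification rule described above amounts to an "any of" test over stage, nodal status, metastasis, Gleason score, and serum PSA. A minimal sketch follows; the record layout and field names are hypothetical, and treating a missing value as not meeting a criterion is a simplification made here (in the study, patients with missing phenotypes could remain unclassified, as noted in Table 4).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    # Hypothetical record layout; None means the value is missing.
    t_stage: Optional[int] = None        # clinical T stage, 0 through 4
    node_positive: Optional[bool] = None
    metastatic: Optional[bool] = None
    gleason_sum: Optional[int] = None    # sum of the two most prevalent patterns
    psa_ng_ml: Optional[float] = None    # serum PSA at diagnosis


def is_more_aggressive_caps(p: Patient) -> bool:
    """Return True if any one of the CAPS criteria described above is met.

    Simplifying assumption: a missing value never satisfies a criterion.
    """
    return any([
        p.t_stage is not None and p.t_stage >= 3,          # T3/T4
        p.node_positive is True,                           # N+
        p.metastatic is True,                              # M+
        p.gleason_sum is not None and p.gleason_sum >= 8,  # 4+4=8 or higher
        p.psa_ng_ml is not None and p.psa_ng_ml > 50,      # >50 ng/mL
    ])


low_risk = Patient(t_stage=1, node_positive=False, metastatic=False,
                   gleason_sum=6, psa_ng_ml=7.2)
high_grade = Patient(t_stage=2, gleason_sum=8, psa_ng_ml=9.0)
high_psa = Patient(t_stage=1, gleason_sum=6, psa_ng_ml=75.0)
```

The thresholds used for the other study populations differ (see Table 2), so a separate rule would be needed for each.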

The PCa patients from the JHH study were men who underwent radical prostatectomy for treatment of PCa at JHH from Jan. 1, 1999, through Dec. 31, 2008. Because the non-JHH populations analyzed in this study included only individuals of European descent, the JHH population was similarly restricted. Tumors were graded and staged after resection; those with Gleason scores of 7, with the most prevalent pattern being 4, or higher, or stage T3b or higher, or N+ or M+ were defined as more aggressive disease (n=1,408). Tumors with Gleason score of 7 with most prevalent pattern 3, or lower, and no evidence of disease dissemination (pathologic stage T2/N0/M0) were defined as less aggressive disease (n=4,318) (Table 5).

The PROCAP study was a cohort of PCa patients diagnosed predominantly with clinically localized disease between 1997 and 2002 and recruited from the National Prostate Cancer Register of Sweden. Among 4,356 patients, 210 were classified as having more aggressive disease (clinical stage T3/T4, N+, M+, Gleason Score ≧8, or pretreatment serum PSA ≧50 ng/mL). The remaining 4,159 patients were classified as having less aggressive disease.

SNPs and Genotyping Methods.

The genotyping data for ~27,000 SNPs in four CGEMS study populations (PLCO, CPS-II, HPFS, and ATBC) were publicly available. These SNPs were genotyped because they were significantly associated with PCa risk in the first-stage GWAS of the CGEMS study (PLCO) using a case-control analysis (11). Individual genotype data from PLCO were obtained through an approved data request application. Summary genotype information from CPS-II, HPFS, and ATBC was downloaded from a publicly accessible CGEMS website (cgems.cancer.gov/data/).

SNP genotyping in the CAPS, JHH, and PROCAP subjects was performed using the MassARRAY iPLEX genotyping system (Sequenom) at Wake Forest University. Duplicate test samples and two water samples (PCR negative controls) that were blinded to the technician were included in each 96-well plate. The rate of concordant results between 100 duplicate samples was >99%.

Statistical Analysis.

Allele frequency differences between two groups of patients were tested for each SNP using a χ² test with 1 degree of freedom within each population. The allelic odds ratio (OR) and 95% confidence interval (95% CI) were estimated based on a multiplicative model. Genotype frequency differences between two groups of patients were also tested using both a dominant and a recessive model for SNPs that were confirmed in an allele test from multiple populations. Results from multiple populations were combined using a Mantel-Haenszel model in which the populations were allowed to have different allele frequencies but were assumed to have a common OR. The homogeneity of ORs among different study populations was tested using the Breslow-Day χ² test.
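The allelic test and the Mantel-Haenszel pooling described above can be sketched in a few lines. The genotype counts below are those listed for the four CGEMS populations in Table 3; the function names are hypothetical, the χ² statistic is computed without a continuity correction (an assumption, since the text does not say whether one was applied), and confidence intervals and the Breslow-Day test are omitted for brevity.

```python
import math


def allele_counts(cc, ct, tt):
    """Return (number of T alleles, number of C alleles) from genotype counts."""
    return ct + 2 * tt, ct + 2 * cc


def allelic_test(aggressive, nonaggressive):
    """Allelic OR, Pearson chi-square (1 df, no continuity correction) and P value.

    Each argument is a (CC, CT, TT) tuple of genotype counts.
    """
    a, b = allele_counts(*aggressive)      # T and C alleles, more aggressive
    c, d = allele_counts(*nonaggressive)   # T and C alleles, less aggressive
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value


def mantel_haenszel_or(strata):
    """Pooled OR assuming a common OR across populations (strata)."""
    numerator = denominator = 0.0
    for aggressive, nonaggressive in strata:
        a, b = allele_counts(*aggressive)
        c, d = allele_counts(*nonaggressive)
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# (CC, CT, TT) counts for more aggressive and less aggressive patients, Table 3.
cgems = {
    "ACS": ((171, 467, 275), (152, 349, 183)),
    "ATBC": ((52, 119, 67), (124, 253, 132)),
    "HPFS": ((29, 43, 46), (75, 191, 123)),
    "PLCO": ((119, 332, 233), (104, 253, 126)),
}

plco_or, plco_chi2, plco_p = allelic_test(*cgems["PLCO"])
pooled_or = mantel_haenszel_or(cgems.values())
```

With these counts the PLCO allelic OR comes out at about 1.28 and the pooled CGEMS estimate at about 1.17, in line with the corresponding entries of Table 3.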

For SNPs that were confirmed to be significantly associated with aggressiveness of PCa, a χ² test using a 2×K table was performed for Gleason scores and T-stage, in which K is the number of possible categories within each variable. All reported P values were based on a two-sided test.

To identify inherited genetic markers that are associated with aggressiveness of PCa, publicly available genotype data were analyzed for ~27,000 SNPs across the genome among 1,980 patients with more aggressive disease and 2,109 patients with less aggressive disease from four CGEMS study populations (PLCO, CPS-II, HPFS, and ATBC) using a case-case analysis (FIG. 1, Table 2). Based on the results of a combined allelic test, a subset of SNPs (n=74) was selected for further evaluation, where P<0.05 for the difference between more and less aggressive disease, and the direction of association was consistent among the four studies. These SNPs were subsequently evaluated in an independent cohort of 1,231 patients with more aggressive disease and 1,619 patients with less aggressive disease from the CAPS study (Table 4). Six of these 74 SNPs were confirmed, with P<0.05 for the allelic test and the same direction of association (Table 7). These six SNPs were then evaluated in 1,408 patients with more aggressive disease and 4,318 patients with less aggressive disease from the Johns Hopkins Hospital (JHH) study population (Table 5). One SNP (rs4054823 at 17p12) had a marginally different allele frequency between the two types of PCa patients (P=0.051), with the same direction of association as in the previous studies (Table 8). This SNP was further evaluated in an additional independent Swedish PCa patient population (PROCAP), comprising 210 patients with more aggressive disease and 4,159 patients with less aggressive disease. The allelic test confirmed the association (P=0.01).

As summarized in Table 3, the frequency of allele T of SNP rs4054823 was consistently higher in patients with more aggressive disease compared with patients with less aggressive disease in each of the four CGEMS populations, and was significant in the combined allelic test (P=9.8×10⁻⁴). The T allele of rs4054823 was also more frequent in patients with more aggressive disease in each of the three independent populations in the confirmation stage, with a value of P=5.0×10⁻⁴ from a combined allelic test. Combining the data from all seven populations, the allelic test of the SNP and aggressiveness of PCa was highly significant (P=2.1×10⁻⁶). When genotype frequencies of this SNP between the two types of PCa were tested using dominant and recessive models, the recessive model (allele T) was most significant (P=2.1×10⁻⁸). This P value exceeded a study-wide significance level at a 5% false positive rate using a conservative Bonferroni correction (27,000 SNPs and three genetic models). The TT genotype was found in 32% of 4,829 cases with aggressive disease and 28% of 12,205 cases with less aggressive disease. Compared with PCa patients who had CC or CT genotypes, patients who had the TT genotype of this SNP had an odds ratio (OR) of 1.26 (95% confidence interval [CI], 1.16-1.36) for aggressive PCa. No heterogeneity was observed in the OR estimates among different populations (P=0.56, Breslow-Day test).
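The Bonferroni comparison and the TT genotype frequencies quoted above can be checked with simple arithmetic. The sketch below assumes the correction divides the 5% level by the product of 27,000 SNPs and three genetic models, as the text indicates, and uses the all-population genotype counts listed in Table 3 (which cover genotyped patients only); the variable and function names are hypothetical.

```python
# Study-wide Bonferroni threshold: 5% level over 27,000 SNPs x 3 genetic models.
alpha = 0.05
n_snps = 27_000
n_models = 3
bonferroni_threshold = alpha / (n_snps * n_models)

# Reported P value for the recessive model across all seven populations.
recessive_p = 2.1e-8
is_study_wide_significant = recessive_p < bonferroni_threshold

# All-population genotype counts (CC, CT, TT) from Table 3.
aggressive = (942, 2305, 1537)
nonaggressive = (2551, 6118, 3424)


def tt_frequency(cc, ct, tt):
    """Fraction of genotyped patients carrying the TT genotype."""
    return tt / (cc + ct + tt)


tt_freq_aggressive = tt_frequency(*aggressive)
tt_freq_nonaggressive = tt_frequency(*nonaggressive)
```

The threshold works out to roughly 6.2×10⁻⁷, so the reported recessive-model P value falls below it, and the two TT frequencies round to 32% and 28%.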

To overcome potential limitations arising from the heterogeneous definitions of aggressive PCa used among these seven study populations, and to more fully characterize the association, an in-depth analysis was performed of the correlation of SNP rs4054823 with specific clinicopathologic variables of PCa including tumor grade as assessed by Gleason score and TNM stage in populations for which this information was available. This analysis was first performed in patients from JHH for the following reasons: (i) a large number of patients (n=5,955) recruited from the same hospital were available; (ii) all patients were treated with radical prostatectomy and thus, unlike patients receiving either no or nonsurgical treatment, their tumors were available for extensive pathologic evaluation; and (iii) tumors were uniformly graded and staged by pathologists at JHH using the same protocol (32, 33). In this analysis, it was found that the frequency of the TT genotype was lower in patients with well- to moderately differentiated cancers (29%, 28%, and 30% in cancers with Gleason scores ≦6, 3+4, and 4+3, respectively) and increased only in patients with more poorly differentiated tumors, i.e., Gleason scores ≧8 (35%), P=0.002 from a χ² test comparing patients with Gleason score ≧8 and <8 (FIG. 2A). Similarly, it was found that the frequency of the TT genotype was lower in patients with low disease stage (pT2, 29% and pT3a, 28%) and was increased in patients with higher disease stage (≧pT3b, 34%; P=0.03, from a χ² test comparing patients with stage ≧pT3b and <pT3b). The difference in TT genotype frequency was largest between the most extreme groups with regard to likelihood of disease progression and lethality: 29% of patients with the least aggressive disease (Gleason score ≦6 and organ-confined stage, pT2, n=3,080), compared with 46% of patients with the most aggressive PCa (Gleason score ≧8 and non-organ-confined stage, ≧pT3b, n=136; OR=2.11; 95% CI: 1.507-2.99), P=1.6×10⁻⁵.

The association of this SNP with clinicopathologic variables was also examined in the Swedish CAPS population, although this population differed from the JHH population in that the treatments included multiple modalities (none, radiation, surgery, and hormonal), resulting in less uniform tumor staging and grading. In this population, the TT genotype frequency also increased with increasing Gleason score and stage; the largest difference was between the most and least aggressive PCa patients (FIG. 2B). The pattern of association, however, differed from that of JHH: a threshold increase of TT genotype frequency in patients with Gleason score ≧8 or stage ≧pT3b was observed in the JHH patients, whereas a gradual increase of TT genotype frequency was observed with increasing Gleason score or stage in CAPS patients. This difference may be due to the pathologic evaluation of prostatectomy specimens in the JHH study versus the clinical grading of biopsy specimens and clinical staging of the majority of cases in the CAPS study. Typically, a ~20-30% discrepancy in grading and staging is observed between clinical and pathologic evaluations of the same patient (34).

This study reflects an important shift in genetic association studies of PCa. Most studies to date have searched for inherited genetic variants that predispose men to overall PCa risk, by comparing men with and without PCa using a case-control design. In contrast, this study was strategically designed to identify inherited genetic markers that distinguish between risk for aggressive versus indolent PCa, by comparing SNPs among PCa patients with these two disease phenotypes using a case-case design. The need for this change in approach is supported by several trends, including a concern over increased rates of diagnosis and treatment of indolent disease and the lack of consistently validated markers of aggressive disease identified using current case-control study designs (26).

In this study, a SNP has been identified with a genotype frequency that is consistently different between patients with more or less aggressive PCa in each of the seven independent populations studied. The difference between the two types of PCa was statistically significant (P=2.1×10⁻⁸), exceeding a conservative study-wide and even genome-wide significance level. More importantly, the difference in frequency was largest between patients with high-grade, non-organ-confined disease and thus at high risk for adverse outcomes compared with patients with low-risk, low-grade, organ-confined disease.

It is of interest to note that the frequency of the TT genotype of SNP rs4054823 in unaffected controls is similar to that observed in less aggressive cases (Table 6), and is significantly higher only among more aggressive cases. This observation implicates such SNPs as not only being informative of risk for aggressive PCa at the time of diagnosis, but also before diagnosis, to possibly target men for more effective PSA screening based on their risk for clinically important PCa.

Based on this study, it is envisioned that a panel of SNPs with characteristics similar to the one described here could be an important part of a genetic-based, targeted PSA screening strategy that is effective in reducing the number of men requiring disease screening, thereby reducing overdiagnosis while also decreasing mortality by facilitating identification of those men at risk for aggressive PCa at a stage when the disease is potentially curable.

All publications and patent applications, nucleotide sequences and/or amino acid sequences identified by GenBank® Database Accession numbers are herein incorporated by reference to the same extent as if each individual publication or patent application or sequence was specifically and individually indicated to be incorporated by reference.

Although the foregoing invention has been described in some detail by way of illustration and example for purposes of clarity of understanding, it will be apparent that certain changes and modifications may be practiced within the scope of the list of the foregoing embodiments and the appended claims.

REFERENCES

1. Jemal et al. (2009) Cancer statistics, 2009. CA Cancer J Clin 59:225-249.
2. Schröder et al. ERSPC Investigators (2009) Screening and prostate-cancer mortality in a randomized European study. N Engl J Med 360:1320-1328.
3. Andriole et al. PLCO Project Team (2009) Mortality results from a randomized prostate-cancer screening trial. N Engl J Med 360:1310-1319.
4. Schaid et al. Investigators of the International Consortium for Prostate Cancer Genetics (2006) Pooled genome linkage scan of aggressive prostate cancer: Results from the International Consortium for Prostate Cancer Genetics. Hum Genet 120:471-485.
5. Lindström et al. (2007) Familial concordance in cancer survival: A Swedish population-based study. Lancet Oncol 8:1001-1006.
6. Amundadottir et al. (2006) A common variant associated with prostate cancer in European and African populations. Nat Genet 38:652-658.
7. Freedman et al. (2006) Admixture mapping identifies 8q24 as a prostate cancer risk locus in African-American men. Proc Natl Acad Sci USA 103:14068-14073.
8. Gudmundsson et al. (2007) Genome-wide association study identifies a second prostate cancer susceptibility variant at 8q24. Nat Genet 39:631-637.
9. Yeager et al. (2007) Genome-wide association study of prostate cancer identifies a second risk locus at 8q24. Nat Genet 39:645-649.
10. Gudmundsson et al. (2007) Two variants on chromosome 17 confer prostate cancer risk, and the one in TCF2 protects against type 2 diabetes. Nat Genet 39:977-983.
11. Thomas et al. (2008) Multiple loci identified in a genome-wide association study of prostate cancer. Nat Genet 40:310-315.
12. Gudmundsson et al. (2008) Common sequence variants on 2p15 and Xp11.22 confer susceptibility to prostate cancer. Nat Genet 40:281-283.
13. Eeles et al. UK Genetic Prostate Cancer Study Collaborators; British Association of Urological Surgeons' Section of Oncology; UK ProtecT Study Collaborators (2008) Multiple newly identified loci associated with prostate cancer susceptibility. Nat Genet 40:316-321.
14. Duggan et al. (2007) Two genome-wide association studies of aggressive prostate cancer implicate putative prostate tumor suppressor gene DAB2IP. J Natl Cancer Inst 99:1836-1844.
15. Haiman et al. (2007) Multiple regions within 8q24 independently affect risk for prostate cancer. Nat Genet 39:638-644.
16. Zheng et al. (2007) Additive effects of two unlinked loci at 8q24 are associated with a considerable fraction of prostate cancer among European Americans. J Natl Cancer Inst 99:1525-1533.
17. Sun et al. (2008) Evidence for two independent prostate cancer risk-associated loci in the HNF1B gene at 17q12. Nat Genet 40:1153-1155.
18. Sun et al. (2009) Sequence variants at 22q13 are associated with prostate cancer risk. Cancer Res 69:10-15.
19. Chang et al. (2009) Fine mapping association study and functional analysis implicate a SNP in MSMB at 10q11 as a causal variant for prostate cancer risk. Hum Mol Genet 18:1368-1375.
20. Hsu et al. (2009) A novel prostate cancer susceptibility locus at 19q13. Cancer Res 69:2720-2723.
21. Zheng et al. (2009) Two independent prostate cancer risk-associated loci at 11q13. Cancer Epidemiol Biomarkers Prev 18:1815-1820.
22. Yeager et al. (2009) Identification of a new prostate cancer susceptibility locus on chromosome 8q24. Nat Genet 41:1055-1057.
23. Gudmundsson et al. (2009) Genome-wide association and replication studies identify four variants associated with prostate cancer susceptibility. Nat Genet 41:1122-1126.
24. Eeles et al. UK Genetic Prostate Cancer Study Collaborators/British Association of Urological Surgeons' Section of Oncology; UK ProtecT Study Collaborators; PRACTICAL Consortium (2009) Identification of seven new prostate cancer susceptibility loci through a genome-wide association study. Nat Genet 41:1116-1121.
25. Al Olama et al. UK Genetic Prostate Cancer Study Collaborators/British Association of Urological Surgeons' Section of Oncology; UK Prostate testing for cancer and Treatment study (ProtecT Study) Collaborators (2009) Multiple loci on 8q24 associated with prostate cancer susceptibility. Nat Genet 41:1058-1060.
26. Kader et al. (2009) Individual and cumulative effect of prostate cancer risk-associated variants on clinicopathologic variables in 5,895 prostate cancer patients. Prostate 69:1195-1205.
27. Kote-Jarai et al. PRACTICAL Consortium (2008) Multiple novel prostate cancer predisposition loci confirmed by an international study: The PRACTICAL Consortium. Cancer Epidemiol Biomarkers Prev 17:2052-2061.
28. Fitzgerald et al. (2009) Analysis of recently identified prostate cancer susceptibility loci in a population-based study: Associations with family history and clinical features. Clin Cancer Res 15:3231-3237.
29. Wiklund et al. (2009) Established prostate cancer susceptibility variants are not associated with disease outcome. Cancer Epidemiol Biomarkers Prev 18:1659-1662.
30. Gelmann. (2008) Complexities of prostate-cancer risk. N Engl J Med 358:961-963.
31. Witte J S (2009) Prostate cancer genomics: Towards a new understanding. Nat Rev Genet 10:77-82.
32. Epstein et al. ISUP Grading Committee (2005) The 2005 International Society of Urological Pathology (ISUP) Consensus Conference on Gleason Grading of Prostatic Carcinoma. Am J Surg Pathol 29:1228-1242.
33. Hoedemaeker et al. (2000) Staging prostate cancer. Microsc Res Tech 51:423-429.
34. Lotan and Epstein. (2009) Gleason grading of prostatic adenocarcinoma with glomeruloid features on needle biopsy. Hum Pathol 40:471-477.
35. Cheng et al. (2008) 8q24 and Prostate cancer: Association with advanced disease and meta-analysis. Eur J Hum Genet 16:496-505.
36. Helfand et al. (2008) Tumor characteristics of carriers and noncarriers of the deCODE 8q24 prostate cancer susceptibility alleles. J Urol 179:2197-2201.
37. Kraft and Hunter. (2009) Genetic risk prediction - are we there yet? N Engl J Med 360:1701-1703.
38. Cooperberg et al. (2009) Risk assessment for prostate cancer metastasis and mortality at the time of diagnosis. J Natl Cancer Inst 101:878-887.
39. Lin. (2004) Functions of heparan sulfate proteoglycans in cell signaling during development. Development 131:6009-6021.
40. Stephenson et al. (2009) Prostate cancer-specific mortality after radical prostatectomy for patients treated in the prostate-specific antigen era. J Clin Oncol 27:4300-4305.
41. Zheng et al. (2008) Cumulative association of five genetic variants with prostate cancer. N Engl J Med 358:910-919.
42. Adolfsson et al. (2007) Clinical characteristics and primary treatment of prostate cancer in Sweden between 1996 and 2005. Scand J Urol Nephrol 41:456-477.
43. Stattin et al. National Prostate Cancer Register (2008) Surveillance and deferred treatment for localized prostate cancer. Population based study in the National Prostate Cancer Register of Sweden. J Urol 180:2423-2429.

TABLE 1 Risk Allele Distribution of 183 Aggressive Versus 184 Non-Aggressive Prostate Cases in a Johns Hopkins Hospital Population

                                            Frequency in     Frequency in
                                    Risk    aggressive       non-aggressive
SNP         CHR  Position  Alleles  allele  prostate cancer  prostate cancer  OR
rs17641637  17   12957061  C/T      T       0.67             0.67             1.03
rs11654550  17   13499770  T/C      C       0.59             0.56             1.15
rs2190856   17   13502089  T/G      G       0.58             0.55             1.13
rs7215323   17   13509789  A/G      A       0.43             0.36             1.39
rs7215137   17   13509982  A/G      G       0.59             0.58             1.06
rs12948596  17   13511800  T/C      T       0.30             0.29             1.01
rs62056886  17   13512134  C/T      T       0.53             0.50             1.14
rs62056887  17   13512555  A/G      A       0.23             0.21             1.13
rs9890022   17   13516145  T/C      T       0.26             0.24             1.12
rs9892382   17   13516433  T/C      C       0.60             0.58             1.10
rs58402698  17   13516645  G/A      G       0.26             0.24             1.14
rs9898581   17   13517421  G/C      G       0.26             0.25             1.08
rs8077904   17   13519696  C/G      G       0.60             0.58             1.11
rs9914411   17   13520722  G/C      G       0.22             0.21             1.06
rs9916271   17   13520827  A/T      A       0.26             0.23             1.15
rs9895696   17   13522216  A/C      A       0.26             0.24             1.13
rs9896834   17   13522666  C/T      T       0.60             0.58             1.12
rs2874922   17   13523506  T/C      C       0.61             0.58             1.12
rs13342347  17   13523600  A/C      A       0.26             0.23             1.16
rs13342371  17   13523675  T/G      T       0.26             0.23             1.16
rs8071527   17   13523773  G/A      A       0.60             0.58             1.11
rs9899320   17   13526026  C/G      C       0.26             0.24             1.14
rs55904171  17   13526981  C/G      C       0.26             0.24             1.14
rs9909795   17   13527634  C/T      C       0.25             0.24             1.11
rs62056948  17   13530709  A/G      A       0.25             0.24             1.09
rs55777305  17   13531676  G/A      G       0.30             0.29             1.01
rs12602893  17   13534884  T/G      T       0.25             0.21             1.25
rs11078175  17   13536856  C/T      T       0.61             0.58             1.13
rs62056953  17   13536977  G/A      G       0.26             0.23             1.15
rs4622548   17   13538724  C/T      T       0.57             0.52             1.23
rs28824801  17   13541019  G/A      G       0.26             0.23             1.15
rs12325885  17   13543663  T/C      C       0.60             0.57             1.11
rs9910556   17   13544834  T/C      C       0.60             0.58             1.10
rs17588248  17   13548141  G/C      G       0.26             0.24             1.13
rs11078178  17   13548190  A/T      T       0.60             0.58             1.11
rs17588297  17   13548205  T/C      T       0.25             0.21             1.21
rs2874927   17   13548602  C/T      C       0.25             0.21             1.21
rs8074120   17   13549576  A/G      G       0.60             0.58             1.10
rs9908002   17   13549722  C/T      C       0.26             0.23             1.17
rs59486592  17   13550210  A/G      A       0.25             0.23             1.13
rs4791554   17   13552900  G/A      G       0.25             0.22             1.20
rs11656731  17   13552932  T/A      T       0.25             0.21             1.21
rs56662934  17   13555214  A/G      A       0.26             0.23             1.18
rs12949913  17   13556021  G/T      T       0.59             0.57             1.07
rs12453942  17   13560013  C/G      C       0.25             0.21             1.24
rs13353193  17   13562260  A/G      A       0.25             0.22             1.20
rs9911679   17   13562733  G/A      A       0.78             0.73             1.34
rs11078179  17   13563342  G/T      T       0.79             0.74             1.30
rs12942445  17   13564329  C/T      T       0.87             0.86             1.13
rs12940830  17   13564583  G/A      G       0.19             0.16             1.17
rs17665271  17   13564994  C/T      T       0.81             0.78             1.23
rs56216350  17   13565289  C/T      T       0.81             0.78             1.23
rs16948318  17   13565430  A/G      G       0.78             0.73             1.35
rs4054823   17   13565749  C/T      T       0.61             0.57             1.21
rs12942294  17   13566080  T/G      G       0.78             0.72             1.39
rs12942086  17   13566150  T/C      C       0.78             0.73             1.38

TABLE 2 Number of Patients with More or Less Aggressive Prostate Cancer in Each of Seven Populations

                    No. of Prostate Cancer Patients
Study Population    More Aggressive   Less Aggressive
CGEMS*
  PLCO              691               489
  ACS (CPS-II)      926               699
  HPFS              123               405
  ATBC              240               516
  Subtotal          1,980             2,109
CAPS†               1,231             1,619
JHH‡                1,408             4,318
PROCAP§             210               4,159
Total               4,829             12,205

*In the CGEMS study, more aggressive disease is defined as Gleason ≧ 7 or T-stage ≧ T3.
†In the CAPS study, more aggressive disease is defined as Gleason ≧ 8 or T-stage ≧ T3.
‡In the JHH study, more aggressive disease is defined as Gleason ≧ (4 + 3) or T-stage ≧ T3b or N+.
§In the PROCAP study, more aggressive disease is defined as Gleason ≧ 8 or N+.

TABLE 3 Association of SNP rs4054823 at 17p12 with Aggressiveness of PCa

Genotype Frequency and Allele Test

                 Aggressive          Nonaggressive       Frequency (T)   Allele Test
Study Population CC    CT    TT      CC    CT    TT      Agg   Nonagg    OR (95% CI)        P
CGEMS study
  ACS            171   467   275     152   349   183     0.56  0.52      1.15 (1.00-1.32)   0.05
  ATBC           52    119   67      124   253   132     0.53  0.51      1.10 (0.88-1.37)   0.39
  HPFS           29    43    46      75    191   123     0.57  0.56      1.04 (0.78-1.40)   0.78
  PLCO           119   332   233     104   253   126     0.58  0.52      1.28 (1.08-1.51)   3.7E−03
  Sub Total      371   961   621     455   1046  564     0.56  0.53      1.17 (1.06-1.28)   9.8E−04
Confirmation
  CAPS           247   589   387     331   841   428     0.56  0.52      1.11 (1.00-1.24)   0.04
  JHH            289   662   448     912   2152  1217    0.56  0.54      1.09 (1.00-1.19)   0.05
  PROCAP         35    93    81      853   2079  1215    0.61  0.54      1.31 (1.07-1.61)   0.01
  Sub Total      571   1344  916     2096  5072  2860    0.56  0.54      1.12 (1.05-1.19)   5.0E−04
All Populations  942   2305  1537    2551  6118  3424    0.56  0.54      1.13 (1.08-1.19)   2.1E−06

Genotype Test

                 Recessive                     Dominant
Study Population OR (95% CI)        P          OR (95% CI)        P
CGEMS study
  ACS            1.18 (0.95-1.47)   0.14       1.24 (0.97-1.58)   0.09
  ATBC           1.12 (0.79-1.58)   0.52       1.15 (0.80-1.66)   0.45
  HPFS           1.38 (0.90-2.12)   0.14       0.73 (0.45-1.20)   0.21
  PLCO           1.46 (1.13-1.89)   3.6E−03    1.30 (0.97-1.75)   0.08
  Sub Total      1.27 (1.10-1.47)   9.1E−04    1.18 (1.00-1.38)   0.04
Confirmation
  CAPS           1.27 (1.08-1.49)   4.5E−03    1.03 (0.86-1.24)   0.75
  JHH            1.19 (1.04-1.35)   1.0E−02    1.04 (0.90-1.21)   0.61
  PROCAP         1.53 (1.15-2.03)   3.5E−03    1.29 (0.89-1.87)   0.18
  Sub Total      1.25 (1.13-1.37)   6.2E−06    1.06 (0.95-1.18)   0.32
All Populations  1.26 (1.16-1.36)   2.1E−08    1.09 (1.00-1.20)   0.05

Recessive and dominant models are defined in terms of risk allele T. For Subtotal and All Populations, the P value or OR (95% CI) were calculated from the CMH test. Breslow-Day P value for all populations/recessive model is 0.5646.

TABLE 4 Clinical and Demographic Characteristics of Subjects in CAPS

                               No. (%) of cases                                      No. (%) of controls
Characteristic                 Aggressive        Localized         All cases
                               (n = 1,231)       (n = 1,619)       (n = 2,899)       (n = 1,722)
Age at enrollment (y)
  Mean (SD)                    68.04 (7.32)      65.14 (6.74)      66.36 (7.13)      67.15 (7.39)
Age, y, at diagnosis
  ≦65                          514 (41.75)       926 (57.19)       1469 (50.78)      N/A
  >65                          717 (58.25)       693 (42.81)       1424 (49.22)      N/A
Family history (first-degree relatives)
  No                           1013 (82.29)      1295 (79.99)      2,342 (80.95)     1,565 (90.57)
  Yes                          218 (17.71)       324 (20.01)       551 (19.05)       163 (9.43)
  Missing data                 0                 0                 0                 0
PSA levels at diagnosis for cases or at enrollment for controls (ng/mL)
  ≦4                           36 (2.95)         185 (11.61)       221 (7.85)        1,438 (83.56)
  4.01-9.99                    171 (14.00)       755 (47.39)       926 (32.91)       230 (13.36)
  10-19.99                     216 (17.69)       438 (27.50)       654 (23.24)       37 (2.15)
  20-49.99                     252 (20.64)       215 (13.50)       467 (16.60)       13 (0.76)
  50-99.99                     229 (18.76)       0                 229 (8.14)        2 (0.12)
  ≧100                         317 (25.96)       0                 317 (11.27)       1 (0.06)
  Missing                      10                26                85                1
T-stage
  T0                           2 (0.16)          7 (0.44)          9 (0.32)          N/A
  T1                           147 (12.07)       933 (58.24)       1080 (38.30)      N/A
  T2                           242 (19.87)       662 (41.32)       904 (32.06)       N/A
  T3                           724 (59.44)       0                 724 (25.67)       N/A
  T4                           103 (8.46)        0                 103 (3.65)        N/A
  TX                           13                17                79                N/A
N-stage
  N0                           222 (70.03)       302 (100.00)      524 (84.65)       N/A
  N1                           95 (29.97)        0                 95 (15.35)        N/A
  NX                           914               1317              2280              N/A
M-stage
  M0                           589 (68.25)       655 (100.00)      1244 (81.95)      N/A
  M1                           274 (31.75)       0                 274 (18.05)       N/A
  MX                           368               964               1381              N/A
Gleason (biopsy)
  4                            9 (0.83)          98 (6.32)         107 (4.06)        N/A
  5                            43 (3.96)         247 (15.93)       290 (10.99)       N/A
  6                            153 (14.08)       832 (53.64)       985 (37.34)       N/A
  7                            414 (38.09)       374 (24.11)       788 (29.87)       N/A
  8                            258 (23.74)       0                 258 (9.78)        N/A
  9                            185 (17.02)       0                 185 (7.01)        N/A
  10                           25 (2.30)         0                 25                N/A
  Missing                      144               68                261               N/A

Forty-nine patients could not be classified as having aggressive or localized disease because of missing phenotypes.

TABLE 5 Clinical and Demographic Characteristics of Study Subjects

                               No. (%) of cases
Characteristic                 Aggressive        Indolent          All cases         Controls
                               (n = 1,408)       (n = 4,318)       (n = 5,955)       (n = 482)
Age at enrollment (y)
  Mean (SD)                    59.8 (6.72)       57.7 (6.49)       58.3 (6.69)       59.91 (7.19)
Age at diagnosis (y)
  ≦65                          1,112 (78.98)     3,833 (88.77)     5,115 (85.89)
  >65                          296 (21.02)       485 (11.23)       839 (14.09)
PSA levels at diagnosis for cases or at enrollment for controls (ng/mL)
  ≦4                           139 (9.87)        1,095 (25.36)     1,262 (21.19)     481 (99.79)
  4.01-9.99                    611 (43.39)       2,264 (52.43)     2,951 (49.55)     0
  10-19.99                     182 (12.93)       247 (5.72)        451 (7.57)        0
  20-49.99                     84 (5.97)         36 (0.83)         131 (2.2)         0
  50-99.99                     34 (2.41)         4 (0.09)          58 (0.97)         0
  ≧100                         63 (4.47)         3 (0.07)          117 (1.96)        0
  Missing                      196 (13.92)       669 (15.49)       985 (16.54)       1 (0.21)
T-stage
  T0                           NA                NA                NA                NA
  T1                           NA                NA                NA                NA
  T2                           368 (26.14)       3,416 (79.11)     3,850 (64.65)     NA
  T3a                          536 (38.07)       902 (20.89)       1,454 (24.42)     NA
  T3b/c                        355 (25.21)       0                 355 (5.96)        NA
  T3/T3X                       9 (0.64)          0                 15 (0.25)         NA
  T4                           3 (0.21)          0                 3 (0.05)          NA
  TX                           137 (9.73)        0                 278 (4.67)        NA
N-stage
  N0                           1,085 (77.06)     4,318 (100)       5,469 (91.84)     NA
  N1                           140 (9.94)        0 (0)             140 (2.35)        NA
  NX                           183 (13)          0 (0)             346 (5.81)        NA
M-stage
  M0                           NA                NA                NA                NA
  M1                           NA                NA                NA                NA
  MX                           1,408             4,318             5,955             NA
Gleason score (biopsy)
  ≦4                           0                 0                 2 (0.03)          NA
  5                            2 (0.14)          67 (1.55)         73 (1.23)         NA
  6                            23 (1.63)         3,042 (70.45)     3,104 (52.12)     NA
  7 (3 + 4)                    106 (7.53)        1,254 (29.04)     1,411 (23.69)     NA
  7 (4 + 3)                    667 (47.37)       0                 667 (11.2)        NA
  8                            317 (22.51)       0                 317 (5.32)        NA
  9                            265 (18.82)       0                 265 (4.45)        NA
  10                           18 (1.28)         0                 18 (0.3)          NA
  Missing                      10 (0.71)         0                 98 (1.65)         NA

A total of 229 patients could not be classified as having aggressive or indolent disease because of missing phenotypes.

TABLE 6
Genotype Frequency of SNP rs4054823 at 17p12 in Controls as Well as Case Patients With Aggressive or Indolent Disease

Genotype frequency

                  Controls                Aggressive              Indolent
Study population  CC      CT      TT      CC      CT      TT      CC      CT      TT

CGEMS study
  ACS             339     904     532     171     467     275     152     349     183
  ATBC            228     473     219     52      119     67      124     253     132
  HPFS            126     304     181     29      43      46      75      191     123
  PLCO            226     548     319     119     312     233     104     253     126
  Sub total       919     2,229   1,251   371     961     621     455     1,046   564
Confirmation
  CAPS            362     865     484     247     589     387     331     841     428
  JHH             106     234     140     289     662     448     912     2,152   1,217
  Sub total       468     1,099   624     536     1,251   835     1,243   2,993   1,645
All populations   1,387   3,328   1,875   907     2,212   1,456   1,698   4,039   2,209

Genotype test (recessive model for T)

                  Controls vs. aggressive         Controls vs. indolent
Study population  OR (95% CI)         P           OR (95% CI)         P

CGEMS study
  ACS             1.01 (0.85-1.20)    0.937       0.85 (0.70-1.04)    0.116
  ATBC            1.25 (0.91-1.73)    0.166       1.12 (0.87-1.44)    0.371
  HPFS            1.52 (1.01-2.28)    0.045       1.10 (0.83-1.45)    0.504
  PLCO            1.25 (1.02-1.54)    0.031       0.86 (0.67-1.09)    0.208
  Sub total       1.15 (1.02-1.29)    0.019       0.95 (0.84-1.08)    0.386
Confirmation
  CAPS            1.17 (1.00-1.37)    0.050       0.93 (0.79-1.08)    0.322
  JHH             1.14 (0.91-1.44)    0.244       0.96 (0.78-1.19)    0.734
  Sub total       1.16 (1.02-1.33)    0.023       0.94 (0.83-1.06)    0.318
All populations   1.16 (1.06-1.26)    1.1E−03     0.94 (0.87-1.03)    0.188

P value and OR (95% CI) in combined populations are for the CMH test. In Controls vs. Aggressive, the Breslow-Day P value for all populations is 0.4142.
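
By way of non-limiting illustration, the recessive-model odds ratios reported in Table 6 can be recomputed from the genotype counts by comparing TT carriers with CC and CT carriers combined. The sketch below does this for the CAPS and JHH confirmation populations (controls versus aggressive cases) and pools the two with a Mantel-Haenszel common odds ratio, which is assumed here to correspond to the CMH point estimate named in the table footnote. The function and variable names are illustrative only, and the confidence intervals and P values of Table 6 are not reproduced.

```python
# Genotype counts (CC, CT, TT) copied from the confirmation rows of Table 6.
# Each entry: study -> (aggressive cases, controls).
CONFIRMATION = {
    "CAPS": ((247, 589, 387), (362, 865, 484)),
    "JHH": ((289, 662, 448), (106, 234, 140)),
}


def recessive_2x2(cases, controls):
    """Collapse (CC, CT, TT) counts to TT versus CC + CT (recessive model for T)."""
    case_tt, case_other = cases[2], cases[0] + cases[1]
    ctrl_tt, ctrl_other = controls[2], controls[0] + controls[1]
    return case_tt, case_other, ctrl_tt, ctrl_other


def odds_ratio(cases, controls):
    """Odds of the TT genotype in cases divided by the odds in controls."""
    a, b, c, d = recessive_2x2(cases, controls)
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel common odds ratio across study populations (strata).

    Assumption: this is the pooled point estimate behind the CMH test
    mentioned in the Table 6 footnote.
    """
    numerator = denominator = 0.0
    for cases, controls in strata:
        a, b, c, d = recessive_2x2(cases, controls)
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


for study, (cases, controls) in CONFIRMATION.items():
    print(study, round(odds_ratio(cases, controls), 2))
print("Pooled", round(mantel_haenszel_or(CONFIRMATION.values()), 2))
```

For these inputs the script prints 1.17 for CAPS, 1.14 for JHH and 1.16 for the pooled estimate, in agreement with the corresponding "Controls vs. aggressive" entries of Table 6.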

That which is claimed is:
 1. A method of identifying a human subject as having an increased risk of developing aggressive prostate cancer, comprising detecting in a nucleic acid sample from the subject a T allele at single nucleotide polymorphism rs4054823 in chromosome region 17p12, wherein the detection of said allele identifies the subject as having an increased risk of developing aggressive prostate cancer.
 2. A method of identifying a human subject as having an increased risk of developing aggressive prostate cancer, comprising detecting in a nucleic acid sample from the subject an allele that is in linkage disequilibrium with the T allele at single nucleotide polymorphism rs4054823 in chromosome region 17p12, wherein the detection of said allele identifies the subject as having an increased risk of developing aggressive prostate cancer.
 3. The method of claim 1, wherein the subject is homozygous for the T allele at single nucleotide polymorphism rs4054823.
 4. The method of claim 1, wherein detecting is carried out by an amplification reaction.
 5. The method of claim 1, wherein detecting is carried out by an amplification reaction and single base extension.
 6. The method of claim 5, wherein the product of the amplification reaction and single base extension is spotted on a silicone chip.
 7. The method of claim 1, wherein detecting is carried out by matrix-assisted laser desorption/ionization-time of flight mass spectrometry (MALDI-TOF-MS).
 8. The method of claim 4, wherein the amplification reaction is a polymerase chain reaction.
 9. The method of claim 1, wherein detecting is carried out by sequencing, hybridization, restriction endonuclease digestion analysis, electrophoresis, or any combination thereof.
 10. A computer-assisted method of identifying a proposed treatment for aggressive prostate cancer as an effective and/or appropriate treatment for a subject carrying a genetic marker correlated with aggressive prostate cancer, comprising the steps of: (a) storing a database of biological data for a plurality of subjects, the biological data that is being stored including for each of said plurality of subjects: (i) a treatment type, (ii) at least one genetic marker associated with aggressive prostate cancer, and (iii) at least one disease progression measure for prostate cancer from which treatment efficacy can be determined; and then (b) querying the database to determine the dependence on said genetic marker of the effectiveness of a treatment type in treating prostate cancer, thereby identifying a proposed treatment as an effective and/or appropriate treatment for a subject carrying a genetic marker correlated with prostate cancer.
 11. The method of claim 10, wherein the genetic marker associated with aggressive prostate cancer is a T allele in single nucleotide polymorphism rs4054823 in chromosome region 17p12.
 12. The method of claim 1, wherein the subject has an elevated prostate serum antigen level.
 13. The method of claim 1, wherein the subject has a family history of prostate cancer.
 14. A kit comprising oligonucleotides to detect the T allele of single nucleotide polymorphism rs4054823 in chromosome region 17p12 and/or a risk allele of a single nucleotide polymorphism in linkage disequilibrium with single nucleotide polymorphism rs4054823 in chromosome region 17p12 in a nucleic acid sample.
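
By way of non-limiting illustration only, steps (a) and (b) of the computer-assisted method of claim 10 might be sketched as follows, using an in-memory SQLite database. The table layout, column names, treatment labels and all records below are hypothetical placeholders invented for the sketch; they are not data or an implementation taken from this document, and a binary progression flag is assumed as the disease progression measure.

```python
import sqlite3

# Hypothetical, made-up records for illustration only (not data from this document).
# Columns: subject id, treatment type, rs4054823 genotype, progressed (1 = yes, 0 = no).
ROWS = [
    (1, "treatment_A", "TT", 0),
    (2, "treatment_A", "TT", 0),
    (3, "treatment_A", "CT", 1),
    (4, "treatment_A", "CC", 0),
    (5, "treatment_B", "TT", 1),
    (6, "treatment_B", "TT", 1),
    (7, "treatment_B", "CT", 0),
    (8, "treatment_B", "CC", 0),
]


def build_database(rows):
    """Step (a): store treatment type, genetic marker and progression measure."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE subjects ("
        "id INTEGER PRIMARY KEY, "
        "treatment TEXT NOT NULL, "
        "genotype TEXT NOT NULL, "
        "progressed INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO subjects VALUES (?, ?, ?, ?)", rows)
    return conn


def progression_by_marker(conn):
    """Step (b): progression rate per treatment, split by TT genotype status."""
    query = (
        "SELECT treatment, genotype = 'TT', COUNT(*), AVG(progressed) "
        "FROM subjects "
        "GROUP BY treatment, genotype = 'TT' "
        "ORDER BY treatment, genotype = 'TT'"
    )
    return conn.execute(query).fetchall()


conn = build_database(ROWS)
for treatment, is_tt, n, rate in progression_by_marker(conn):
    print(treatment, "TT" if is_tt else "CC/CT", n, rate)
```

Each result row gives a treatment type, whether the subjects are homozygous for the T allele, the number of subjects, and the fraction whose disease progressed. Comparing these fractions across genotype groups within a treatment is one simple way to examine the dependence of treatment effectiveness on the marker; an actual analysis would require real clinical data and an appropriate statistical test.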